101 research outputs found

    Comparing Multi-objective and Threshold-moving ROC Curve Generation for a Prototype-based Classifier

    Proceedings of: GECCO 2013: 15th International Conference on Genetic and Evolutionary Computation Conference (Amsterdam, The Netherlands, July 06-10, 2013): a recombination of the 22nd International Conference on Genetic Algorithms (ICGA) and the 18th Annual Genetic Programming Conference (GP).
    Receiver Operating Characteristics (ROC) curves represent the performance of a classifier for all possible operating conditions, i.e., for all preferences regarding the tradeoff between false positives and false negatives. The generation of a ROC curve generally involves the training of a single classifier for a given set of operating conditions, with the subsequent use of threshold-moving to obtain a complete ROC curve. Recent work has shown that the generation of ROC curves may also be formulated as a multi-objective optimization problem in ROC space: the goals to be minimized are the false positive and false negative rates. This technique also produces a single ROC curve, but the curve may derive from operating points for a number of different classifiers. This paper aims to provide an empirical comparison of the performance of both of the above approaches, for the specific case of prototype-based classifiers. Results on synthetic and real domains show a performance advantage for the multi-objective approach.
    GECCO 2013 Presentation slides.
    This work has been funded by the Spanish Ministry of Science under contract TIN2011-28336 (MOVES project).
    En prens
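The threshold-moving procedure mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single classifier that outputs a real-valued score per example, with higher scores meaning "more likely positive". The function names `roc_points` and `auc` and the toy scores are hypothetical and are not taken from the paper, which studies prototype-based classifiers and does not specify this code.

```python
def roc_points(scores, labels):
    """Sweep a decision threshold over the scores and return (FPR, TPR) points.

    scores: real-valued classifier outputs (higher = more positive).
    labels: ground-truth labels, 1 for positive and 0 for negative.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold to this score; all tied examples flip together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    toy_scores = [0.9, 0.8, 0.7, 0.6]  # illustrative values only
    toy_labels = [1, 0, 1, 0]
    curve = roc_points(toy_scores, toy_labels)
    print(curve, auc(curve))
```

Each point on the resulting curve comes from the same trained classifier. In the multi-objective formulation described in the abstract, by contrast, different points on the curve may come from different classifiers.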

    Artefacts and biases affecting the evaluation of scoring functions on decoy sets for protein structure prediction

    Motivation: Decoy datasets, consisting of a solved protein structure and numerous alternative native-like structures, are in common use for the evaluation of scoring functions in protein structure prediction. Several pitfalls with the use of these datasets have been identified in the literature, as well as useful guidelines for generating more effective decoy datasets. We contribute to this ongoing discussion an empirical assessment of several decoy datasets commonly used in experimental studies

    Multiobjective optimization in bioinformatics and computational biology


    Towards a fairer reimbursement system for burn patients using cost-sensitive classification

    The adoption of the Prospective Payment System (PPS) in the UK National Health Service (NHS) has led to the creation of patient groups called Health Resource Groups (HRG). HRGs aim to identify groups of clinically similar patients that share similar resource usage for reimbursement purposes. These groups are predominantly identified based on expert advice, with homogeneity checked using the length of stay (LOS). However, for complex patients such as those encountered in burn care, LOS is not a perfect proxy of resource usage, leading to incomplete homogeneity checks. To improve homogeneity in resource usage and severity, we propose a data-driven model and the inclusion of patient-level costing. We investigate whether a data-driven approach that considers additional measures of resource usage can lead to a more comprehensive model. In particular, a cost-sensitive decision tree model is adopted to identify features of importance and rules that allow for a focused segmentation on resource usage (LOS and patient-level cost) and clinical similarity (severity of burn). The proposed approach identified groups with increased homogeneity compared to the current HRG groups, allowing for a more equitable reimbursement of hospital care costs if adopted.
    Comment: Joint KDD 2021 Health Day and 2021 KDD Workshop on Applied Data Science for Healthcare: State of XAI and trustworthiness in Healt

    Publication outperformance among global South researchers: An analysis of individual-level and publication-level predictors of positive deviance

    From Springer Nature via Jisc Publications Router
    History: received 2020-12-19, accepted 2021-08-05, registration 2021-08-06, pub-electronic 2021-09-13, online 2021-09-13, pub-print 2021-10
    Publication status: Published
    Abstract: Research and development are central to economic growth, and a key challenge for countries of the global South is that their research performance lags behind that of the global North. Yet, among Southern researchers, a few significantly outperform their peers and can be styled research “positive deviants” (PDs). In this paper we ask: who are those PDs, what are their characteristics and how are they able to overcome some of the challenges facing researchers in the global South? We examined a sample of 203 information systems researchers in Egypt who were classified into PDs and non-PDs (NPDs) through an analysis of their publication and citation data. Based on six citation metrics, we were able to identify and group 26 PDs. We then analysed their attributes, attitudes, practices, and publications using a mixed-methods approach involving interviews, a survey and analysis of publication-related datasets. Two predictive models were developed using partial least squares regression; the first predicted if a researcher is a PD or not using individual-level predictors and the second predicted if a paper is a paper of a PD or not using publication-level predictors. PDs represented 13% of the researchers but produced about half of all publications, and had almost double the citations of the overall NPD group. At the individual level, there were significant differences between both groups with regard to research collaborations, capacity development, and research directions. At the publication level, there were differences relating to the topics pursued, publication outlets targeted, and paper features such as length of abstract and number of authors